Statistical Applications in Genetics and Molecular Biology
Authors
Abstract
We investigate an important aspect of a meta-algorithm for variable selection in microarray data. This wrapper method starts from any classification algorithm and weights each variable (i.e. gene) according to its efficiency for classification. An optimization procedure is then derived that identifies the genes important for the biological process under study. Theory and application with the SVM classifier were presented in Gadat and Younes (2007), and we extend this method to CART. The classification error rates are computed on three well-known public data sets (Leukemia, Colon and Prostate) and compared with those from other wrapper methods (RFE, ℓ0-norm SVM, Random Forests). This allows the statistical relevance of the proposed algorithm to be assessed. Furthermore, a biological interpretation based on the outputs of the Ingenuity Pathway Analysis software clearly shows that the gene selections from the different wrapper methods yield highly relevant biological information, compared with a classical filter gene selection based on the t-test.
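The wrapper idea summarized in the abstract (weight each gene by how much it helps a classifier, then keep the highest-weighted genes) can be illustrated with a minimal sketch. This is not the algorithm of Gadat and Younes (2007): it assumes a simple multiplicative weight update, a nearest-centroid classifier standing in for SVM or CART, resubstitution error as the efficiency measure, and synthetic data. All function names and parameters (`make_toy_data`, `weighted_sample`, `weight_genes`, `subset_size`, `step`) are hypothetical.

```python
import math
import random


def make_toy_data(n_samples=40, n_genes=10, informative=(0, 1), seed=1):
    """Synthetic two-class 'expression' data; only `informative` genes differ by class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            row[g] += 3.0 * label  # shift the informative genes in class 1
        X.append(row)
        y.append(label)
    return X, y


def nearest_centroid_error(X, y, genes):
    """Resubstitution error of a nearest-centroid classifier restricted to `genes`."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(r[g] for r in rows) / len(rows) for g in genes]
    errors = 0
    for x, t in zip(X, y):
        dist = {
            label: sum((x[g] - c) ** 2 for g, c in zip(genes, centroids[label]))
            for label in (0, 1)
        }
        pred = 0 if dist[0] <= dist[1] else 1
        errors += pred != t
    return errors / len(X)


def weighted_sample(rng, weights, k):
    """Draw k distinct gene indices, each draw proportional to the current weights."""
    candidates = list(range(len(weights)))
    chosen = []
    for _ in range(k):
        w = [weights[g] for g in candidates]
        g = rng.choices(candidates, weights=w, k=1)[0]
        chosen.append(g)
        candidates.remove(g)
    return chosen


def weight_genes(X, y, subset_size=3, n_iter=500, step=0.05, seed=0):
    """Assumed update rule: genes drawn in low-error subsets gain relative weight."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    weights = [1.0 / n_genes] * n_genes
    for _ in range(n_iter):
        subset = weighted_sample(rng, weights, subset_size)
        err = nearest_centroid_error(X, y, subset)
        for g in subset:
            # 0.5 is the chance-level error for two balanced classes
            weights[g] *= math.exp(step * (0.5 - err))
        total = sum(weights)
        weights = [w / total for w in weights]  # keep a probability distribution
    return weights


if __name__ == "__main__":
    X, y = make_toy_data()
    weights = weight_genes(X, y)
    ranking = sorted(range(len(weights)), key=lambda g: -weights[g])
    print("genes ranked by weight:", ranking)
```

Because the error here is measured on the training samples, it is optimistic; a careful comparison would estimate it by cross-validation or on held-out data, as is done for the error rates reported in the abstract.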
Similar resources
Strategies and Clinical Applications of Next Generation Sequencing
Abstract DNA sequencing is one of the great valuable techniques in molecular biology, which can be used to detect the sequence of nucleotides in a DNA fragment. The high-throughput sequencing known as Next Generation Sequencing (NGS) revolutionized genomic research and molecular biology; therefore, the whole human genome can be sequenced with a low cost in several days. NGS technology is simi...
SLC2A4 Polymorphisms Can Be a New Molecular Biomarker for Sports Genomics
"SLC2A4 Polymorphisms Can Be a New Molecular Biomarker for Sports Genomics" is an editorial article and has no abstract.
Statistical Applications in Genetics and Molecular Biology
This note is a comment on the article “Dimension Reduction for Classification with Gene Expression Microarray Data” that appeared in Statistical Applications in Genetics and Molecular Biology (Dai et al., 2006).
Expression Analysis of PKS13, FG08079.1 and PKS10 Genes in Fusarium graminearum and Fusarium culmorum
Background: Identification and quantification of mycotoxins produced by Fusarium species are important in controlling fungal diseases. Objectives: Potential of zearalenone, butenolide and fusarin C production was investigated in five Fusarium graminearum and five F. culmorum isolates at molecular level. Materials and Methods: Presence of PKS13, FG08079.1 and PKS10 genes, associated with produ...
Molecular Epidemiology of Breast Cancer among Iranian-Azeri Population based on P53 Research
Background: This study was done in order to enhance our understanding about molecular and epidemiological features of breast cancer among the Azeri population with special emphasis on the detection of TP53 mutations. We also analyzed the role of the P53codon72 polymorphism (rs1042522) and its role in susceptibility to breast cancer. Methods: ...
Publication date: 2010